Save memory when numeric terms agg is not top #55873

Merged on May 8, 2020 (21 commits)

Conversation

@nik9000 nik9000 commented Apr 28, 2020

Right now all implementations of the `terms` agg allocate a new
`Aggregator` per bucket. This uses a bunch of memory. Exactly how much
isn't clear, but each `Aggregator` ends up making its own objects to read
doc values, which have non-trivial buffers, and it forces all of its
sub-aggregations to do the same. We allocate a new `Aggregator` per
bucket for two reasons:

1. We didn't have an appropriate data structure to track the
   sub-ordinals of each parent bucket.
2. You can only make a single call to `runDeferredCollections(long...)`
   per `Aggregator`, which was the only way to delay collection of
   sub-aggregations.

This change adds a way to return "deferred" aggregations from any
bucket aggregation and undefers them as part of regular collections.
This mechanism allows you to defer without
`runDeferredCollections(long...)`.
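
Roughly, the deferral idea looks like the sketch below. This is just an illustration with made-up names (`SubCollector`, `replay`), not the actual Elasticsearch classes: during the normal pass we buffer which doc landed in which bucket, then replay only the surviving buckets into the sub-aggregations later.

```java
// Minimal sketch of the general "deferred collection" idea; names are
// invented for illustration and are not the Elasticsearch API.
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class DeferredCollectionSketch {
    /** Hypothetical stand-in for a sub-aggregation's collector. */
    interface SubCollector {
        void collect(int docId, long bucketOrd);
    }

    // Buffered (docId, owning bucket ordinal) pairs from the first pass.
    private final List<long[]> buffered = new ArrayList<>();

    /** First pass: remember the doc instead of collecting sub-aggs right away. */
    void collectDeferred(int docId, long bucketOrd) {
        buffered.add(new long[] { docId, bucketOrd });
    }

    /** Second pass: replay only the buckets that made the final cut. */
    void replay(SubCollector sub, Set<Long> survivingBuckets) {
        for (long[] entry : buffered) {
            if (survivingBuckets.contains(entry[1])) {
                sub.collect((int) entry[0], entry[1]);
            }
        }
    }
}
```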

This change also adds a fairly simplistic data structure to track the
sub-ordinals for `long`-keyed buckets.
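
In spirit, that structure hands out one dense bucket ordinal per (owning bucket, long key) pair, so a single aggregator can serve every parent bucket. The sketch below is a naive, map-based illustration only; the real structure (called `LongKeyedBucketOrds` later in this thread) is much more memory-conscious.

```java
// Naive illustration only, not the real LongKeyedBucketOrds. The point is
// the mapping: (owning bucket ordinal, long key) -> one dense bucket ordinal
// shared by a single Aggregator instance.
import java.util.HashMap;
import java.util.Map;

class LongKeyedBucketOrdsSketch {
    private final Map<Long, Map<Long, Long>> ords = new HashMap<>();
    private long nextOrd = 0;

    /** Returns the ordinal for (owningBucketOrd, key), allocating one if needed. */
    long add(long owningBucketOrd, long key) {
        return ords.computeIfAbsent(owningBucketOrd, o -> new HashMap<>())
                   .computeIfAbsent(key, k -> nextOrd++);
    }

    /** Number of distinct buckets allocated so far. */
    long size() {
        return nextOrd;
    }
}
```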

It uses both of those to power numeric `terms` aggregations and removes
the per-bucket allocation of their `Aggregator`. This fairly
substantially reduces memory consumption of numeric `terms` aggregations
that are not the "top level", especially when those aggregations contain
many sub-aggregations.

I picked numeric `terms` aggregations because those have the simplest
implementation. At least, I could kind of fit it in my head. And I
haven't fully understood the "bytes"-based `terms` aggregations, but I
imagine I'll be able to make similar optimizations to them in follow-up
changes.
@nik9000 nik9000 commented Apr 28, 2020

@elasticmachine, test this please

nik9000 added a commit to nik9000/rally-tracks that referenced this pull request Apr 29, 2020
Adds a rally track specifically for testing the performance of the terms
agg as I'll be doing some work on it. In particular this focuses on
numeric terms because the first phase of my work only touches them.

Relates to elastic/elasticsearch#55873
@nik9000 nik9000 commented Apr 29, 2020

This looks like it might actually speed up nested numeric terms.

Before:

| Metric | Task | Value | Unit |
| --- | --- | ---: | --- |
| 90th percentile service time | keyword_terms_numeric_terms | 3567.84 | ms |
| 90th percentile service time | numeric_terms_numeric_terms | 2461.33 | ms |
| 90th percentile service time | date_histo_numeric_terms | 3950.39 | ms |

After:

| Metric | Task | Value | Unit |
| --- | --- | ---: | --- |
| 90th percentile service time | keyword_terms_numeric_terms | 3344.94 | ms |
| 90th percentile service time | numeric_terms_numeric_terms | 2090.58 | ms |
| 90th percentile service time | date_histo_numeric_terms | 2623.48 | ms |

I kind of figured that it'd be a little faster, but these numbers are more than I'd thought. They look real, though.

@@ -158,19 +158,27 @@ public final InternalAggregation reducePipelines(

@Override
public InternalAggregation copyWithRewritenBuckets(Function<InternalAggregations, InternalAggregations> rewriter) {

I can revert this.

@nik9000 nik9000 commented May 3, 2020

I've rebuilt this, replacing the delayed building of results with reworking agg results to be built for the entire aggregator at once. I've pushed some code that looks to mostly work, and I'll be cleaning it up soon!
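
Sketching the shape of that change with placeholder names (this is not the real `Aggregator` API, just an illustration): instead of each per-bucket aggregator building its own result, one aggregator builds results for all of its owning bucket ordinals in a single call.

```java
// Illustrative interface only; method names are placeholders, not the real
// Elasticsearch Aggregator API.
interface ResultBuilding<R> {
    // Old shape: one Aggregator per bucket, each producing a single result.
    R buildSingleResult();

    // New shape: a single aggregator produces results for all of its owning
    // bucket ordinals at once, so the per-bucket Aggregator instances (and
    // their doc-values readers) never need to exist.
    R[] buildResults(long[] owningBucketOrds);
}
```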

@@ -174,6 +373,9 @@ public Aggregator resolveSortPath(AggregationPath.PathElement next, Iterator<Agg

@Override
public BucketComparator bucketComparator(String key, SortOrder order) {
if (false == this instanceof SingleBucketAggregator) {

This was missing from some work that I did previously, and removing the wrapping of `LongTermsAggregator` revealed it.

Without this we get strange test failures around sorting.

nik9000 added a commit that referenced this pull request Jun 2, 2020
Saves memory when the `geotile_grid` and `geohash_grid` are not on the
top level by using the `LongKeyedBucketOrds` we built in #55873.
hub-cap pushed a commit to elastic/rally-tracks that referenced this pull request Jun 3, 2020
Adds a rally track specifically for testing the performance of the bucket
aggs when they are "sub" aggs.

Relates to elastic/elasticsearch#55873
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 9, 2020
Reworks the `parent` and `child` aggregations that are not at the top level
to use the optimization from elastic#55873. Instead of wrapping all
non-top-level `parent` and `child` aggregators, we now handle being a
child aggregator in the aggregator itself, specifically by recording
which global ordinals show up in the parent and then checking whether they
match the child.
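
For illustration only (not the actual parent-join aggregator code), the "record parent global ordinals, then check children" idea in that commit message boils down to something like this sketch:

```java
// Hedged sketch: record which join-field global ordinals were seen on parent
// documents, then test child documents against that set. The real aggregator
// uses Lucene's specialized bit sets; java.util.BitSet keeps the idea visible.
import java.util.BitSet;

class ParentOrdinalSketch {
    private final BitSet parentOrds = new BitSet();

    /** While collecting parent docs, remember their global ordinal. */
    void collectParent(int globalOrd) {
        parentOrds.set(globalOrd);
    }

    /** While collecting child docs, count only those whose ordinal was seen. */
    boolean seenInParent(int globalOrd) {
        return parentOrds.get(globalOrd);
    }
}
```
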
nik9000 added a commit that referenced this pull request Jun 10, 2020
Reworks the `parent` and `child` aggregations that are not at the top level
to use the optimization from #55873. Instead of wrapping all
non-top-level `parent` and `child` aggregators, we now handle being a
child aggregator in the aggregator itself, specifically by recording
which global ordinals show up in the parent and then checking whether they
match the child.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 10, 2020
Reworks the `parent` and `child` aggregations that are not at the top level
to use the optimization from elastic#55873. Instead of wrapping all
non-top-level `parent` and `child` aggregators, we now handle being a
child aggregator in the aggregator itself, specifically by recording
which global ordinals show up in the parent and then checking whether they
match the child.
nik9000 added a commit that referenced this pull request Jun 10, 2020
Reworks the `parent` and `child` aggregations that are not at the top level
to use the optimization from #55873. Instead of wrapping all
non-top-level `parent` and `child` aggregators, we now handle being a
child aggregator in the aggregator itself, specifically by recording
which global ordinals show up in the parent and then checking whether they
match the child.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 10, 2020
This uses the optimization that we started making in elastic#55873 for
`rare_terms` to save a bit of memory when that aggregation is not on the
top level.
nik9000 added a commit that referenced this pull request Jun 12, 2020
This uses the optimization that we started making in #55873 for
`rare_terms` to save a bit of memory when that aggregation is not on the
top level.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 12, 2020
This uses the optimization that we started making in elastic#55873 for
`rare_terms` to save a bit of memory when that aggregation is not on the
top level.
nik9000 added a commit that referenced this pull request Jun 12, 2020
This uses the optimization that we started making in #55873 for
`rare_terms` to save a bit of memory when that aggregation is not on the
top level.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 15, 2020
This merges the aggregator for `significant_text` into
`significant_terms`, applying the optimization built in elastic#55873 to save
memory when the aggregation is not on top. The `significant_text`
aggregation is pretty memory intensive all on its own and this doesn't
particularly help with that, but it'll help with the memory usage of any
sub-aggregations.
nik9000 added a commit that referenced this pull request Jun 18, 2020
This merges the aggregator for `significant_text` into
`significant_terms`, applying the optimization built in #55873 to save
memory when the aggregation is not on top. The `significant_text`
aggregation is pretty memory intensive all on its own and this doesn't
particularly help with that, but it'll help with the memory usage of any
sub-aggregations.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 18, 2020
This merges the aggregator for `significant_text` into
`significant_terms`, applying the optimization built in elastic#55873 to save
memory when the aggregation is not on top. The `significant_text`
aggregation is pretty memory intensive all on its own and this doesn't
particularly help with that, but it'll help with the memory usage of any
sub-aggregations.
nik9000 added a commit that referenced this pull request Jun 23, 2020
This merges the aggregator for `significant_text` into
`significant_terms`, applying the optimization built in #55873 to save
memory when the aggregation is not on top. The `significant_text`
aggregation is pretty memory intensive all on its own and this doesn't
particularly help with that, but it'll help with the memory usage of any
sub-aggregations.
Labels
:Analytics/Aggregations, >enhancement, Team:Analytics, v7.9.0, v8.0.0-alpha1